Factor Analysis Based Speaker Verification Using ASR
نویسندگان
چکیده
In this paper, we propose to improve speaker verification performance by importing better posterior statistics from acoustic models trained for Automatic Speech Recognition (ASR). This approach aims to introduce state-of-the-art techniques in ASR to speaker verification task. We compare statistics collected from several ASR systems, and show that those collected from deep neural networks (DNN) trained with fMLLR features can effectively reduce equal error rate (EER) by more than 30% on NIST SRE 2010 task, compared with those DNN trained without feature transformations. We also present derivation of factor analysis using variational Bayes inference, and illustrate implementation details of factor analysis and probabilistic linear discriminant analysis (PLDA) in Kaldi recognition toolkit.
منابع مشابه
Effects of audio and ASR quality on cepstral and high-level speaker verification systems
Speech data for NIST speaker recognition evaluations has traditionally been distributed in compressed, telephone quality form, even for microphone data that was originally recorded at higher quality. We evaluate the effect that improved audio quality has for speaker verification performance, using a recently released full-bandwidth version of microphone data from the SRE2010 evaluation. Remarka...
متن کاملASR Dependent Techniques for Speaker Recognition
This thesis is concerned with improving the performance of speaker recognition systems in three areas: speaker modeling, verification score computation, and feature extraction in telephone quality speech. We first seek to improve upon traditional modeling approaches for speaker recognition, which are based on Gaussian Mixture Models (GMMs) trained globally over all speech from a given speaker. ...
متن کاملAugmenting short-term cepstral features with long-term discriminative features for speaker verification of telephone data
Short-term cepstral features have long been chosen as standard features for speaker recognition thanks to their relevance and effectiveness. In contrast, discriminative features, calculated by a multi-layer perceptron (MLP) from much longer stretches of time, have been gradually adopted in automatic speech recognition (ASR). It has been shown that augmenting short-term cepstral features with lo...
متن کاملUsing Exciting and Spectral Envelope Information and Matrix Quantization for Improvement of the Speaker Verification Systems
Speaker verification from talking a few words of sentences has many applications. Many methods as DTW, HMM, VQ and MQ can be used for speaker verification. We applied MQ for its precise, reliable and robust performance with computational simplicity. We also used pitch frequency and log gain contour for further improvement of the system performance.
متن کاملInvestigating factor analysis features for deep neural networks in noisy speech recognition
The problem of speaker and channel adaptation in deep neural network (DNN) based automatic speech recognition (ASR) systems is of substantial interest in advancing the performance of these systems. Recently, the speaker identity vectors (i-vectors) have shown improvements for ASR systems in matched conditions. In this paper, we propose the application of the general factor analysis framework fo...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016